Abstract

Background: Many scoring systems exist for clock drawing task variants. However, none of them is reliable in evaluating
longitudinal changes of cognitive function. The purpose of this study is to create a simple yet optimal scoring procedure to
evaluate cognitive decline using a clinic-based sample.

Methods: Clock drawings from 121 participants (76 individuals without dementia who did not develop dementia after a
mean 41.2-month follow-up, and 45 individuals without dementia who became demented after a mean 42.3-month follow-up) were
analyzed using t-tests to determine a new and simplified CDT scoring system. The new scoring method was then compared
with other commonly used systems.

Results: In the converters, only 7 items differed significantly between the initial visits and the second
visits. We propose a new scoring system that includes these seven critical items: numbers are equally spaced (12-3-6-9)
(p = 0.031), the other eight numbers are marked (p = 0.022), numbers are clockwise (p = 0.002), all numbers are correct
(p = 0.030), distance between numbers is constant (p = 0.016), clock has two hands (p < 0.001), arrows are drawn (p = 0.003).
Compared with other traditionally used scoring methods, this based change clock drawing test (BCCDT) has one of the most
balanced sensitivities/specificities in a clinic-based sample.

Conclusions: The new CDT scoring system provides further evidence in support of a simple and reliable clock-drawing
scoring system in follow-up studies to evaluate cognitive decline, which can be used in assessing the efficacy of medicine.
Introduction

Neuropsychological evaluations are an integral part of a 
complete geriatric evaluation used to diagnose dementia. The 
clock drawing test (CDT), widely acknowledged for its simplicity 
and ease of administration, is a measure used to detect cognitive 
decline associated with a variety of neurobehavioral disorders. 
Moreover, the CDT requires different cognitive abilities including 
auditory and visual comprehension, concentration, visuospatial 
abilities, abstract conceptualization, and executive control [1]. 
Deficits in these areas reflect possible frontal and temporoparietal 
disturbances that are often exhibited in Alzheimer disease (AD) 
[2-4], and that may not easily be detected by commonly-used 
cognitive screening tests such as the Mini-Mental State Exam 
(MMSE) [5]. Correlating highly with the MMSE [6] and other 
measures of global cognitive decline, the CDT serves as a simple 
and nonthreatening cognitive screen, rendering it a popular tool in 
both clinical and research practices [5], [7]. 

In the past 30 years, many variations of the Clock Drawing Test 
(CDT) have risen to the forefront as a dementia screening measure 
[6], [8-17] (see table 1). The scoring system by Shulman et al. in 
1986 [18] was one of the oldest methods. Sunderland et al. [9]
used a 10-point anchored system based on preset criteria with an
arbitrary cut-off at 6 points. They found that interrater reliability 
was high in clinicians and non-clinicians. However, this scale 
proved difficult to apply according to the criteria provided since it 
assumes that the representation of the hands is first and entirely 
affected, and other errors in the representation of numbers and the 
clock face occur later. Therefore, some drawings received very low 
scores for minor errors in the representation of numbers even 
though the hands were properly placed. In 1989, Wolf-Klein et al.
[6] tested patients who were admitted consecutively to a nursing 
facility without preselection, although the group with AD was 
older than the normal group. The 10 anchor points pertain only to
the spacing of the numbers; time setting is not assessed, and
their system was therefore less sensitive to problems with executive
functioning. Sample anchor points include: 10 'normal'; 7 'very
inappropriate spacing'; 4 'counter-clockwise rotation'; and 1
'irrelevant figures'. They reported a sensitivity of 75.2% and a
specificity of 97.7% for distinguishing between demented and
"mentally normal elderly." The Clock Completion Test of Watson
et al. [11] is an objective and simple scoring method. The subject
is asked to place all the numbers in the clock, but not to set a time.
Consequently, the scoring is only based on the position of the
Table 1. Characteristics of clock drawing test scoring systems.

Reference | Pre-drawn clock | Time setting | Scoring criteria and range | Correlation with other measures
Shulman et al. (1986) | Yes | 11:10 | 5 points awarded for "perfect" clock, 4 points for clock containing minor visuospatial errors, 3 points for acceptable visuospatial organization but inaccurate representation of 10 past 11, 2 points for moderate visuospatial disorganization of numbers, 1 point for a severe level of visuospatial disorganization and 0 points for inability to make any reasonable attempt. | MMSE = -0.65, SPMSQ = -0.66, GDS = -0.32
Sunderland et al. (1989) | No | 2:45 | 10-point scoring system with 1 as the lowest score and 10 as the highest score. 5 points given for accurate drawing a clock face with numbers placed correctly; remaining 6-10 points awarded for accuracy of hands denoting the time 2:45. Cut-off score of 6/10 indicates normal cognitive functioning. | GDS (r = 0.56), DRS (r = 0.59), BDRS (r = 0.51), SPMSQ (r = 0.59, p<0.001)
Wolf-Klein et al. (1989) | Yes | No | 10-point system with scores corresponding to 10 hierarchical clock patterns from a previous pilot study. Cut-off score of less than 7 indicating "abnormal." | Not assessed
Watson et al. (1993) | Yes | No | Clock is divided into four quadrants with the greatest weight assigned to the fourth quadrant (numbers 9-12). Each error falling into quadrants one, two and three contributes a score of 1, and each error in the fourth quadrant contributes a score of 4. Score of 0-3 indicates normality, whereas a score of 4 or greater indicates abnormality. | Not assessed
Mendez et al. (1992) | No | 11:10 | 20-item scale with each clock attribute independently scored as a dichotomous variable. Attributes based on analysis of frequency of errors in clock drawing test. | Rey Complex Figure = 0.66, Symbol Digit = 0.65, MMSE = 0.45, CDS = 0.40
Royall et al. (1998) | No | 1:45 | Maximum score on the drawing task (CLOX 1) is 15 points. Maximum score on the copying task (CLOX 2) is 15 points. Lower scores indicate impairment. Cut-off scores of 10/15 (drawing task) and 12/15 (copying task) to indicate normal functioning. Points are awarded based on the answers to a set of 15 questions (e.g., Does figure resemble a clock? Outer circle present?) | EXIT25 (r = 0.78, p<0.001), MMSE (r = 0.76, p<0.001)
Rouleau et al. (1992) | No | 11:10 | 10-point scale that independently assesses three subscales: (1) representation of clock face (maximum of 2 points); (2) layout of numbers (maximum of 4 points); and (3) position of hands (maximum of 4 points). Lower scores indicate greater impairment. | Not assessed
Tuokko et al. (1992) | Yes | 11:10 | Errors on clock drawing categorized into the following classes: perseverations, omissions, rotations, misplacements, distortions, substitutions and additions. Greater than two errors on clock drawing considered abnormal. Clock setting and clock reading achieve a maximum of 3 points. Greater than 2 errors is considered a positive (abnormal) result for clock drawing, whereas the cut-off for the clock setting and clock reading tasks was a score of less than 13. | Not assessed
Manos and Wu (1994) | Yes | 11:10 | 10-point system with a transparent circle divided into eighths that acts as a scoring tool for the drawn clock. Points are awarded based on the numbers falling into their proper section and accuracy of hands. Cut-off score of 7/10 used by authors to indicate a "normal" clock. | Trail Making Test Part A (r = 0.48, p<0.001), MMSE (r = 0.50, p<0.001), Block Design Test (r = 0.56, p<0.001)
Lessig et al. (2008) | No | 8:20 or 11:10 | Analyzed three existing scoring systems [8], [10], [13] to isolate six specific errors that were best able to discriminate patients with dementia from those without. A final algorithm was created from these six errors: inaccurate time setting, missing hands, missing numbers, number substitutions or repetitions, and failure to attempt clock drawing. If any error was identified, the clock was classified as abnormal. | Not assessed

doi:10.1371/journal.pone.0097873.t001

numbers in the clock face. No hands are required or scored and so
some sensitivity is lost. The authors report that the number of
digits in the 4th quadrant (9-12) had the best agreement with the
diagnosis of dementia. The Clock Drawing Interpretation Scale
(CDIS) by Mendez et al. [10] uses 20 points distributed between
general impression, placement of numbers and hands with a score
Table 2. Age, education, and interval between two assessments for individuals classified as Non-converters and Converters.

Characteristic, mean (SD) | Non-converters (n = 76) | Converters (n = 45) | t (P)
Age at baseline (years) | 68.8 (9.0) | 69.4 (6.8) | 0.340 (0.735)
Education (years) | 13.4 (2.6) | 12.6 (2.3) | 1.726 (0.087)
Interval between two assessments (months) | 41.2 (16.7) | 42.3 (18.8) | 0.349 (0.728)

doi:10.1371/journal.pone.0097873.t002

of higher than 18 as being normal. The authors found that the
presence of the number 2 and the correct location of the minute 
hand were the items most frequently absent and were absent in all 
AD patients. In 1998, Royall et al. [14] developed the CLOX test, 
a CDT scoring system, which they mention is specifically designed 
to measure executive control functions. The patient is asked to 
draw a clock on an empty page and later to copy a clock. The 
authors suggest that the difference between these tests can be a 
measure of executive control function. 

Recently, a growing number of studies have focused on the utility of the CDT in
screening for mild cognitive impairment, despite inconsistent
results, but little is known about the longitudinal changes in
performance before and after cognitive decline. To our knowledge,
most previous studies were cross-sectional, and none
has evaluated whether individuals with no dementia had progressed
to mild Alzheimer's disease using the CDT. Therefore, we conducted
this study to investigate which aspects of clock drawing are
important for assessing the characteristic changes in
performance over time.

Methods 

Participants 

This study was conducted at the Memory Clinic of Shanghai 
Huashan Hospital Fudan University. The cohort consisted of 
participants referred to the clinic between June 2004 and Nov 
2012 after they had finished the laboratory tests and cranial CT/ 
MRI scan and were found to have no clinically significant 
abnormalities in vitamin B12, folic acid, thyroid function (free 
triiodothyronine-FT3, free tetraiodothyronine-FT4, thyroid-stimulating
hormone-TSH), rapid plasma reagin (RPR), or treponema
pallidum particle agglutination (TPPA). During the initial visits, all 
patients were assessed by physicians experienced in dementia 
disorders, and underwent thorough physical, psychiatric and 
neurological examinations, as well as an interview that focused on 
their cognitive symptoms. All of the MCI participants were
diagnosed according to the following criteria, which take the Mayo criteria
[37] as reference: (1) cognitive complaints verified by an
informant; (2) cognitive impairment lasting more than 3 months;
(3) mini-mental state examination-Chinese version (C-MMSE) >
cut-off score adjusted for education: edu ≥ 9 yr, 26; 6 ≤ edu < 9 yr,
22 [19]; (4) preserved basic ability of daily life (ADL)/minimal
impairment in complex instrumental functions; (5) etiology 
unknown; (6) normal hearing and sight; (7) have not met the 
diagnostic criteria for dementia based on the criteria from the 
National Institute of Neurological and Communicative Disorders 
and Stroke and the Alzheimer's Disease and Related Disorders 
Association (NINCDS-ADRDA). 

In the present study, 121 participants at baseline were included.
The participants were followed for about four years after the first
visits. Seventy-six participants did not convert to dementia over longitudinal
follow-up with a mean duration of 41.2 months. These
participants are termed Non-converters (mean age = 68.8 years,
SD = 9.0; mean education = 13.4 years, SD = 2.6). Another group
of 45 participants progressively deteriorated and were judged
clinically as having developed Alzheimer's Disease over longitudinal
follow-up. They are termed Converters (mean age = 69.4
years, SD = 6.8; mean education = 12.6 years, SD = 2.3). The
mean duration of follow-up for the converters was 42.3 months
(SD = 18.8). AD was diagnosed as probable AD according to the
National Institute of Neurological and Communicative Disorders
and Stroke-Alzheimer's Disease and Related Disorders Association
(NINCDS-ADRDA/NINCDS-AIREN) criteria. According to the
scores obtained on the MMSE (Mini-Mental State Examination)
[20], CFT (complex figure test) [21], AVLT (auditory verbal
learning test) [22], AFT (animal fluency test) [23], STT (shape trails
test) [24], CDR (clinical dementia rating scale) [25] and
SCWT (Stroop color word test) [26] at both visits, the severity
of AD was only mild. This study was approved by the ethics
committee of Shanghai Huashan Hospital Fudan University. All
participants signed a consent form.

Procedure 

To determine general cognitive function, all study subjects
completed the MMSE (Mini-Mental State Examination) [20], CFT
(complex figure test) [21], AVLT (auditory verbal learning test)
[22], AFT (animal fluency test) [23], STT (shape trails test) [24],
CDR (clinical dementia rating scale) [25] and SCWT (Stroop color
word test) [26] at both visits.

During the clock-drawing test, participants were asked to draw a 
big circle and put the numbers of the clock, and then they were 
asked to indicate the time as "50 after 13." There was no time 
limit for this test. 

Based on previous studies, we chose 18 items and classified
them into three major components: (a) drawing planning; (b)
numbering; (c) placement and size of the hands. Each category can
be further subdivided into several aspects. Within this study, we
scored each clock according to the 18 items, rating each item 1 if correct
and 0 if wrong.

Moreover, five different scoring systems were used to score each 
clock blinded to the results of the rest of the assessment. We chose 
them because they were simple, representative and took the 
physicians less time. The three semi-quantitative scoring systems 
(Sunderland et al., 1989; Shulman et al., 1993; Watson et al., 
1993) focused on scoring the whole clock, while the two 
quantitative scoring systems (MOCA-CDT, 2005 [38]; Pfizer 
Inc. and Eisai Inc) focused on different aspects of the clocks (such 
as clock face, numbers or hands) and scored them separately. The 
scoring methods used in this study are as follows: (1) The CDT by
Sunderland et al.: 10 'hands are in correct position'; 7 'placement
of hands is significantly off course'; 4 'further distortion of number
sequence'; and 1 'either no attempt or an uninterpretable attempt
is made'; (2) The CDT by Watson et al.: a clock is divided into
quadrants and a score of 1 point is given for any error in the first 3
quadrants and 4 points for any error in the 4th quadrant; (3) The
CDT by Shulman et al.: this scoring method has been better at
predicting AD compared to other scoring methods [9]. Five points
were given for a perfect clock and 0-4 points depending on the
severity of the errors; (4) The CDT by Pfizer Inc. and Eisai Inc:
clock scoring criteria are basic with 1 point per clock contour,
numbers, numbers' location and hands; (5) MOCA-CDT: the 3-item
scoring system scores range from 0 (normal clock) to 3 (severe
impairment). Within the MoCA, clock drawing is one test item
involving 3 of the total 30 points possible. Likely to maximize
clinical time, clock scoring criteria are basic with 1 point per clock
contour, numbers and hands.
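
As an illustration of how a weighted scheme such as the Watson method can be computed, the following is a minimal sketch and not the authors' implementation. It assumes that a rater has already recorded which quadrants contain at least one error, and it reads the rule as one penalty per affected quadrant; the function names are hypothetical. The abnormality threshold of 4 follows Table 1.

```python
from typing import Iterable

# Penalty per quadrant: quadrants 1-3 count 1 point each,
# quadrant 4 (numbers 9-12) counts 4 points.
QUADRANT_WEIGHTS = {1: 1, 2: 1, 3: 1, 4: 4}


def watson_score(quadrants_with_errors: Iterable[int]) -> int:
    """Sum the weights of every quadrant that contains at least one error.

    Assumption: a quadrant is penalised once, however many errors it holds.
    """
    return sum(QUADRANT_WEIGHTS[q] for q in set(quadrants_with_errors))


def watson_is_abnormal(score: int, threshold: int = 4) -> bool:
    """Scores of 0-3 are read as normal, 4 or greater as abnormal."""
    return score >= threshold


# Hypothetical example: errors in quadrants 1 and 3 only.
example = watson_score([1, 3])
print(example, watson_is_abnormal(example))
```

Because higher Watson scores indicate worse performance, this scale runs in the opposite direction to the other systems used here.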

Data analysis 

Initial analyses (t test) examined the relationship between 
cognitive status (non-converter vs. converter) and age, years of 
education or interval between the two assessments to determine if 
these variables should be considered as covariates. 

MMSE total, CFT-Copy, CFT-Recall, AVLT-I, AVLT-II,
AFT-total, STT-A, STT-B, CDR, CDR-SB were compared
between non-converters at the first visit (V1) and non-converters
at the second visit (V2), converters V1 and converters V2, as well
as non-converters V2 and converters V2 to determine the general
cognitive function.

We selected 18 CDT items associated with dementia
according to previous studies. Each of the 18 items was converted
to a dichotomous variable (0, 1) with "0" indicating no and "1"
indicating yes. In order to understand if any of the items could
predict cognitive status (non-converter vs. converter), an initial t
test was conducted between non-converters V1 and converters V1.
To find the longitudinal changes in performance before and after
cognitive decline, a second t test was then conducted between V1
and V2 in converters. Moreover, we conducted another t test
between non-converters V2 and converters V2 to identify the
differences in the 18 items between the patients with dementia and
the patients with no dementia.
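
The item-level comparison described above can be sketched as follows. This is only an illustration: the text does not state whether a paired or an independent-samples t test was used, so the sketch assumes an independent two-sample Student t statistic with pooled variance, and the item ratings are hypothetical rather than study data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance.

    Assumption: the two samples are treated as independent.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


# Hypothetical ratings of one CDT item (1 = correct, 0 = wrong)
# for eight converters at the first (V1) and second (V2) visit.
item_v1 = [1, 1, 1, 1, 0, 1, 1, 1]
item_v2 = [1, 0, 0, 1, 0, 1, 0, 0]

t_value = pooled_t_statistic(item_v1, item_v2)
print(round(t_value, 3))
```

A positive statistic here means the item was passed more often at V1 than at V2; the p value would then be read from a t distribution with na + nb - 2 degrees of freedom.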

Once the items that significantly discriminated between
converters V1 and converters V2 had been isolated, we proposed
a new scoring system named the based change clock drawing test
(BCCDT). Then we compared the BCCDT with the CDT by
Sunderland et al., the CDT by Watson et al., the CDT by
Shulman et al., the CDT by Pfizer Inc. and Eisai Inc and MOCA-CDT.

For all of the 242 assessments, the CDT scores obtained from 
the six scoring methods were correlated with each other to 
investigate the relationship between the types of scoring method. 

Continuous variables were compared with the
Student t-test, or with the Mann-Whitney U test when the data were not
normally distributed.

P values and CIs were estimated in a 2-tailed fashion.
Differences were considered statistically significant at P<0.05.

Data were analysed using statistical software (SPSS 13.0; 
Chicago, Illinois, USA). 

Results 

1. Characteristics of the participants 

During clinical follow-up, 76 (63%) participants remained non-demented
and 45 (37%) participants developed dementia. We
divided all of the participants into two groups: Non-converters and
Converters. An initial t test revealed that age (t = 0.340, P = ns),
education (t = 1.726, P = ns) and interval between two assessments
(t = 0.349, P = ns) did not differ significantly between the two groups (see
table 2).
2. Cognitive state of the Non-converters and Converters

According to the cognitive state at V2, we got four groups as
follows: Non-converters V1, Non-converters V2, Converters V1
and Converters V2. The first two meant baseline visit and second
visit of the participants who did not convert to dementia over
longitudinal follow-up, and the latter two indicated first visit and
second visit of the participants who developed Alzheimer's
Disease over longitudinal follow-up. We used t tests to compare
MMSE total, CFT-Copy, CFT-Recall, AVLT-I, AVLT-II, AFT-total,
STT-A, STT-B, CDR, CDR-SB between two of the four
groups. The results showed that most differences were significant,
except for CFT-Copy between Non-converters V1 and Converters
V1, and STT-B and CDR between Converters V1 and Converters V2
(see table 3).

3. Significant and non-significant items

Table 4 shows the t tests conducted to assess the utility of the 18
items in our sample. Firstly, there was only one significant item at
baseline between converters and non-converters (t = 4.731,
p = 0.030), and performance in the converters was better than
that in the non-converters, meaning that it was difficult to predict
dementia. Secondly, for the items that were poorly finished, that is,
those with an accuracy rate lower than 50% at baseline in converters
("12, 3, 6, 9" are first written after the circle; "1, 2, 4, 5,
7, 8, 10, 11" are equally spaced; hour hand is towards correct
number; minute hand is towards correct number; minute hand is
longer than hour hand), there was no significant difference
between V1 and V2, showing that poorly finished items at baseline
were not always the sensitive ones for predicting dementia. Thirdly, in
the converters, there were four items whose score was
higher at V2 than at V1, indicating that those four items were not
helpful in improving forecast value. Therefore the total score of
CDT should not just be the addition of each item. Fourthly, at the
second visit, there were 15 items that were significantly different
between non-converters and converters. But among the converters,
there were only 7 items that could tell differences between V1
and V2, which means that when comparing dementia with no
dementia, the sensitive items in cross-sectional comparison
and longitudinal comparison were not the same. Finally, there
were seven significant items that appeared to be possible markers
of progression to dementia in follow-up studies: numbers are
equally spaced (12-3-6-9) (p = 0.031), the other eight numbers are
marked (p = 0.022), numbers are clockwise (p = 0.002), all numbers
are correct (p = 0.030), distance between numbers is constant
(p = 0.016), clock has two hands (p < 0.001), arrows are drawn
(p = 0.003). All of these parameters indicated remarkable differences
between baseline and follow-up scores in converters. The
conclusion that can be drawn here is that these seven items may
constitute a simple clock-drawing scoring system in follow-up
studies to evaluate whether individuals with no dementia had
progressed to dementia. We named the new scoring system the based
change clock drawing test (BCCDT). The other 11 items
did not prove to be major contributors.
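
To make the proposed scale concrete, the sketch below totals the seven BCCDT items for one drawing. It assumes each item is rated 1 if correct and 0 if wrong, as for the 18 candidate items, and that the total is the simple count of correct items (range 0 to 7). The item identifiers and the function name are hypothetical labels chosen for illustration.

```python
# Hypothetical identifiers for the seven BCCDT items named in the text.
BCCDT_ITEMS = (
    "numbers_equally_spaced_12_3_6_9",
    "other_eight_numbers_marked",
    "numbers_clockwise",
    "all_numbers_correct",
    "distance_between_numbers_constant",
    "clock_has_two_hands",
    "arrows_drawn",
)


def bccdt_score(ratings: dict) -> int:
    """Count the correctly drawn BCCDT items (0-7) for one clock.

    `ratings` maps each item identifier to 1 (correct) or 0 (wrong).
    """
    missing = [item for item in BCCDT_ITEMS if item not in ratings]
    if missing:
        raise KeyError(f"missing ratings: {missing}")
    return sum(1 for item in BCCDT_ITEMS if ratings[item])


# Hypothetical drawing: every item correct except hands and arrows.
drawing = {item: 1 for item in BCCDT_ITEMS}
drawing["clock_has_two_hands"] = 0
drawing["arrows_drawn"] = 0
print(bccdt_score(drawing))
```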

4. Clock performance in relation to performance on other
cognitive measures

Table 5 presents the correlation coefficients between the CDT
score and other cognitive measures using correlation analysis. All
correlations between the six CDT scores and the MMSE total
score, AVLT-I, AFT and STT-A were significant, with the highest
correlation occurring between BCCDT and MMSE total score,
AVLT-I, AFT and STT-A. Sunderland scoring system and the
BCCDT correlated with the time of Rey-O CFT-Copy (s),
Table 6. Correlation coefficients between the six scoring methods based on the 242 assessments.

Scoring method | MOCA-CDT | Pfizer Inc. and Eisai Inc | Shulman | Watson | Sunderland | BCCDT
MOCA-CDT | 1 | .907** | .877** | -.627** | .825* | .719**
Pfizer Inc. and Eisai Inc |  | 1 | .888** | -.692** | .820** | .784**
Shulman |  |  | 1 | -.697** | .896** | .781**
Watson |  |  |  | 1 | -.751** | -.748**
Sunderland |  |  |  |  | 1 | .788**
BCCDT |  |  |  |  |  | 1

*Correlation is significant at the 0.05 level (two-tailed).
**Correlation is significant at the 0.01 level (two-tailed).
doi:10.1371/journal.pone.0097873.t006



AVLT-II and correct number of SCWT-C; between these two systems,
the highest correlations with the time of Rey-O
CFT-Copy (s) and AVLT-II were obtained using BCCDT. Watson,
Sunderland scoring system and BCCDT correlated with the CFT-Recall,
and the three correlation coefficients were similar. MOCA-CDT,
Shulman scoring system, Sunderland scoring system, and
BCCDT correlated with time of SCWT-C (s), and the highest
correlation was between BCCDT and time of SCWT-C (s). In
conclusion, BCCDT displayed good correlation with
other memory clinic measures (see table 5).

5. Correlations between the six scoring methods 

Table 6 summarizes correlations between the six scoring
methods, including BCCDT. For the total 242 assessments, the
six systems are moderately-to-highly correlated, with the highest
correlation occurring between the MOCA-CDT and Pfizer Inc.
and Eisai Inc scoring method. All correlations between BCCDT
and others were statistically significant at the 0.01 level.
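
A pairwise correlation of this kind can be sketched as below. The text does not specify which coefficient was computed, so the example assumes Pearson's r, and the scores are hypothetical. It also illustrates why the Watson method correlates negatively with the other systems: its scale runs in the opposite direction, with higher scores indicating more errors.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical scores for five drawings under two systems.
shulman_scores = [5, 4, 3, 2, 1]  # higher = better drawing
watson_scores = [0, 1, 2, 4, 7]   # higher = more errors

print(round(pearson_r(shulman_scores, watson_scores), 3))
```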

6. The utility of BCCDT compared with the other five scoring
systems

A t test was conducted to assess the utility of the six scoring
systems. We found that in converters, the scores at V1 and V2 were
significantly different, and the p value of BCCDT (p < 0.001) was the
smallest of the six (see Table 7).

7. Discrimination of different scoring systems between 
non-converters and converters 

The area under the ROC curve is perhaps a more unbiased
method to determine the efficiency of a screening test as it shows
the relationship between sensitivity and specificity. ROC curves
were drawn for the six scoring systems to evaluate their respective
areas under the curve, sensitivities, and specificities (see Table 8).
Using the optimal cut-off score of 5, the differences between the
two groups were most discernible under BCCDT, according to the
ROC curve (area under the curve = 0.713, p = 0.001), with a
sensitivity and specificity of 78.6% and 57.1%,
respectively. The Watson scoring method had the smallest area
under the curve (0.571, p = 0.260). The MOCA-CDT and
Shulman scoring systems had the highest sensitivities at 92.9%
and 88.1%, respectively. BCCDT and Sunderland scoring
procedures fell in the middle at 78.6% and 73.8%, respectively.
The Watson method had the lowest sensitivity at 54.8%,
performing just above chance level for correctly identifying
individuals with dementia. With regard to specificity, the BCCDT
scoring procedure had the highest specificity at 57.1%, followed by
the Sunderland scoring method at 47.6%. The specificity of the Pfizer Inc. and
Eisai Inc method (38.1%) closely trailed that of the Watson scoring system.
Both the Shulman and MOCA-CDT procedures had the
lowest specificities at 33.3% and 28.6%, respectively.
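
The quantities reported in this section can be illustrated with a short sketch. It assumes a scale on which lower scores indicate impairment, so that a score at or below the cut-off counts as a positive screen; this direction is an assumption and would be reversed for the Watson method. The area under the curve is computed with the rank-based formulation, that is, the probability that a randomly chosen converter scores lower than a randomly chosen non-converter, with ties counted as one half. The data are hypothetical.

```python
def sensitivity_specificity(scores, converted, cutoff):
    """Return (sensitivity, specificity), treating score <= cutoff as abnormal."""
    pairs = list(zip(scores, converted))
    tp = sum(1 for s, c in pairs if c and s <= cutoff)
    fn = sum(1 for s, c in pairs if c and s > cutoff)
    tn = sum(1 for s, c in pairs if not c and s > cutoff)
    fp = sum(1 for s, c in pairs if not c and s <= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, converted):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, c in zip(scores, converted) if c]
    neg = [s for s, c in zip(scores, converted) if not c]
    wins = sum(
        1.0 if p < n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical second-visit scores: four converters, four non-converters.
scores = [3, 5, 6, 4, 7, 7, 5, 6]
converted = [True, True, True, True, False, False, False, False]

print(sensitivity_specificity(scores, converted, cutoff=5))
print(auc(scores, converted))
```

Raising the cut-off flags more drawings as abnormal, which increases sensitivity at the expense of specificity; the area under the curve summarises this trade-off across all cut-offs.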

Discussion 

Using the clinical sample, BCCDT was found to be effective in 
evaluating the longitudinal changes in clock drawing test (CDT) 
performance before and after cognitive decline. It includes seven 
critical items (numbers are equally spaced (12-3-6-9), the other 
eight numbers are marked, numbers are clockwise, all numbers 
are correct, distance between numbers is constant, clock has two 
hands, arrows are drawn). Further investigations should examine 
these seven items in the context of other indicators of dementia 
such as story recall and the MMSE score. 

MMSE is one of the most influential cognitive screening
methods. It has been widely used in screening for dementia and MCI.
In previous studies, the orientation and delayed recall parts of the
MMSE were good at predicting AD specifically [27], [28]. But with
the clinical use of the MMSE, researchers found it was not
sensitive enough to be used in follow-up of cognitive function.
Recently, a growing number of studies have focused on the CDT, as it can
reflect different cognitive abilities including auditory and visual
comprehension, concentration, visuospatial abilities, abstract
conceptualization, and executive control [1]. However, most were
cross-sectional studies, and there were few longitudinal studies. Ji et al.
[29] described the longitudinal changes in performance and error
types on CDT by dementia severity and subtypes. They concluded
that longitudinal analysis of errors on the CDT may reflect different
characteristics of cognitive deterioration according to dementia
subtypes and dementia stages. Zhou [30] used Death scoring
systems (total score of 4) to assess the efficacy of medicine, but the
sensitivity and specificity have not been verified. Therefore, we hope
a suitable CDT scoring system will help to evaluate cognitive
function longitudinally. Lennie et al. [31] found that "the clock has
two hands, the size difference of the hands is respected, and the
hour hand is towards correct number" were three interesting
findings that were early discriminators for developing dementia.
These items may be good indicators of further cognitive decline.
Sebastian et al. [32] concluded that the MMSE and the clock
drawing test were as accurate as CSF biomarkers in predicting
future development of AD in patients with MCI. But in our study,
table 4 illustrated that there was only one significant item at
baseline between the non-converters and converters, and performance
in the converters was better than that in the non-converters,
which means that it was difficult to predict dementia
using any one of the 18 items.
Compared with the MCI-NP (participants with mild cognitive
impairment who did not develop dementia on follow-up visits)
group, participants in the MCI-D (participants with mild cognitive
impairment who became demented after a 48-month follow-up) group
were more likely to fail the item for "size difference of the hands is
respected" [31]. However, our results showed that poorly finished
items at baseline were not always the sensitive ones for predicting dementia.

A majority of studies focused on the utility of different CDT 
scoring system in screening dementia or MCI [33-35]. In this 
study, we found that when comparing dementia with no dementia, 
the sensitive items between cross-sectional comparison and 
longitudinal comparison were not the same. Therefore, the 
existing CDT scoring systems were not suitable for follow-up 
studies, and could not be used in assessing the efficacy of medicine. 

According to the scores obtained on the MMSE (Mini-Mental
State Examination), CFT (complex figure test), AVLT
(auditory verbal learning test), AFT (animal fluency test), STT
(shape trails test), CDR (clinical dementia rating scale) and SCWT
(Stroop color word test) at both visits, the severity of AD was only
mild. Patients with moderate to severe dementia were
not able to complete all of the tests. Therefore, BCCDT could be
used for earlier recognition of whether patients with MCI had
progressed to mild AD. In addition, we discovered that the total
score of CDT should not just be the addition of each item, as
several items were not helpful in improving forecast value.

In comparing the non-converters and converters, the new
scoring method with a cut-off score of 5 produced a sensitivity of
78.6% and a specificity of 57.1%. Even though the sensitivity of
MOCA-CDT and the Shulman scoring method surpassed the new
scoring system's sensitivity, the specificity of the new method was
the highest among the six systems. This comparison revealed the
new method to be more balanced than others for screening AD.

The correlation of the CDT with other screening tests,
including the 'gold standard' MMSE, was good in most studies
[15], [36], as well as in our study. We suggest that there may be a
rationale for using both the MMSE and the CDT whilst evaluating
longitudinal changes of cognitive function, as the MMSE mostly
measures verbal skills and so may not be sensitive enough.
However, this would considerably increase the time of administration.

Because this was a longitudinal study, one might think that the
performance decline in the seven items of BCCDT was due to
aging. But in the non-converters, there was no significant
difference between V1 and V2. Therefore, our results should
not be interpreted as determining the effect of aging on CDT
performance.

Several limitations of this study need to be considered when 
examining the results. The sample used in this study was not 
population based, but comprised clinic-based participants, which 
was not as ethnically diverse nor representative as might be 
desired. The utility of our proposed scale should be
verified in other population contexts to avoid the bias of "pre-selected
patients". There was no correlation analysis between the
time of V2 and the time of the diagnosis of
dementia. Moreover, AD was diagnosed as probable AD
according to the NINCDS-ADRDA/NINCDS-AIREN criteria,
and no distinctive biomarkers such as beta-amyloid or
positron emission tomography (PET) were available, so diagnostic error could not be
excluded.

Key points 

Seven items of the clock drawing test may constitute a simple clock-drawing
scoring system in follow-up studies to evaluate whether
individuals with no dementia had progressed to dementia.
Table 8. Discrimination of different scoring systems between non-converters and converters.

Scoring system | Area under the curve | Asymptotic sig. | Asymptotic 95% confidence interval | Cut-off | Sensitivity (%) | Specificity (%)
MOCA-CDT | 0.601 | 0.110 | 0.479-0.723 | <2 | 92.9 | 28.6
Pfizer Inc. and Eisai Inc | 0.579 | 0.210 | 0.457-0.702 | <3 | 69.0 | 38.1
Shulman | 0.589 | 0.161 | 0.465-0.713 | ≤3 | 88.1 | 33.3
Watson | 0.571 | 0.260 | 0.447-0.696 | >1 | 54.8 | 40.5
Sunderland | 0.622 | 0.054 | 0.498-0.746 | ≤8 | 73.8 | 47.6
BCCDT | 0.713 | 0.001 | 0.602-0.825 | ≤5 | 78.6 | 57.1

doi:10.1371/journal.pone.0097873.t008



It was difficult to predict dementia using the 18 items of clock
drawing test.

Poorly finished items of clock drawing test were not always 
sensitive to predict dementia. 

Some items of clock drawing test were not helpful to improve 
forecast value of dementia. 